Appropriate Kernel Functions for Support Vector Machine Learning with Sequences of Symbolic Data

Authors

  • Bram Vanschoenwinkel
  • Bernard Manderick
Abstract

In classification problems, machine learning algorithms often rely on the assumption that (dis)similar inputs lead to (dis)similar outputs. Two questions then naturally arise: what does it mean for two inputs to be similar, and how can this be used in a learning algorithm? In support vector machines, similarity between input examples is implicitly expressed by a kernel function that calculates inner products in the feature space. For numerical input examples the concept of an inner product is easy to define; for discrete structures like sequences of symbolic data, however, it is less obvious. This article describes an approach to SVM learning for symbolic data that can serve as an alternative to the bag-of-words approach under certain circumstances. The bag-of-words approach first transforms symbolic data into vectors of numerical data, which are then used as arguments for one of the standard kernel functions. In contrast, we propose kernels that operate on the symbolic data directly.
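The contrast drawn in the abstract can be illustrated with a minimal sketch. The abstract does not say which kernels the authors propose, so the position-wise overlap kernel below is only an assumed stand-in for "a kernel that operates on the symbolic data directly"; the function names and the example sentences are hypothetical as well.

```python
from collections import Counter


def bag_of_words_kernel(x, y):
    """Linear kernel on symbol-count vectors; symbol order is ignored."""
    cx, cy = Counter(x), Counter(y)
    return sum(cx[s] * cy[s] for s in cx.keys() & cy.keys())


def overlap_kernel(x, y):
    """Illustrative kernel on the symbols themselves (an assumption, not
    the paper's definition): count positions where the sequences agree."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return sum(a == b for a, b in zip(x, y))


# Two sequences with the same symbols in a different order.
s = ["the", "dog", "bit", "the", "man"]
t = ["the", "man", "bit", "the", "dog"]

bow = bag_of_words_kernel(s, t)  # same value as bag_of_words_kernel(s, s)
ovl = overlap_kernel(s, t)       # lower than overlap_kernel(s, s)
```

Because the bag-of-words representation discards order, it cannot tell `s` from `t`, whereas the position-wise comparison can. The overlap count equals an inner product of one-hot position encodings, so it is a valid kernel; whether it suits a given task is a separate question.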


Similar resources

MODELING OF FLOW NUMBER OF ASPHALT MIXTURES USING A MULTI–KERNEL BASED SUPPORT VECTOR MACHINE APPROACH

Flow number of asphalt–aggregate mixtures as an explanatory factor has been proposed in order to assess the rutting potential of asphalt mixtures. This study proposes a multiple–kernel based support vector machine (MK–SVM) approach for modeling of flow number of asphalt mixtures. The MK–SVM approach consists of weighted least squares–support vector machine (WLS–SVM) integrating two kernel funct...


Remote Sensing and Land Use Extraction for Kernel Functions Analysis by Support Vector Machines with ASTER Multispectral Imagery

Land use is considered a key element in land change studies, environmental planning and natural resource applications. Studying the Earth’s surface by remote sensing has many benefits, such as continuous acquisition of data, broad regional coverage, cost-effective data, map-accurate data, and large archives of historical data. To study land use / cover, remote sensing as an effic...


Detection of Glioblastoma Multiforme Tumor in Magnetic Resonance Spectroscopy Based on Support Vector Machine

Introduction: The brain tumor is an abnormal growth of tissue in the brain, which is one of the most important challenges in neurology. Brain tumors have different types. Some brain tumors are benign and some brain tumors are cancerous and malignant. Glioblastoma Multiforme (GBM) is the most common and deadliest malignant brain tumor in adults. The average survival rate for peo...


Separating Well Log Data to Train Support Vector Machines for Lithology Prediction in a Heterogeneous Carbonate Reservoir

The prediction of lithology is necessary in all areas of petroleum engineering. This means that to design a project in any branch of petroleum engineering, the lithology must be well known. Support vector machines (SVMs) use an analytical approach to classification based on statistical learning theory, the principles of structural risk minimization, and empirical risk minimization. In this res...


PREDICTION OF SLOPE STABILITY STATE FOR CIRCULAR FAILURE: A HYBRID SUPPORT VECTOR MACHINE WITH HARMONY SEARCH ALGORITHM

The slope stability analysis is routinely performed by engineers to estimate the stability of river training works, road embankments, embankment dams, excavations and retaining walls. This paper presents a new approach to build a model for the prediction of slope stability state. The support vector machine (SVM) is a new machine learning method based on statistical learning theory, which can so...




Publication date: 2004